Word count: 597
This research analysed 2015 sub-region data from Afghanistan, reported by the Demographic and Health Surveys (DHS) Program (Repository 2018). The analysis focused on the potential spatial correlation between female literacy and household access to electricity, which can be a measure of modernisation and development (Desai 2012). QGIS and R were used to explore, visualise and compare female literacy rates and household access to electricity across sub-regions in Afghanistan. This study does not attempt to establish causation; it is intended solely to visualise a potential spatial correlation.
The study began by using QGIS to read shapefiles downloaded from the Spatial Data Repository. Many fields had incomplete data, including the maternal mortality rate and HIV-prevalence indicators, with missing values coded as ‘9999’. Since the fields of interest (female literacy: ‘EDLITRWLIT’; household electricity: ‘HCELECHELC’) were complete, no interpolation was required. Following the data inspection, two identical copies of the map were combined to produce a single choropleth map visualising both variables. To ensure that the map portrayed the correlation in question, sub-regions where fewer than 50% of households had electricity were highlighted through rule-based graduated symbology. Observations were binned using Jenks natural breaks classification. Transparencies and symbols were adjusted, and the QGIS map was generated with appropriate map elements. Finally, the map was embedded into RMarkdown (Dennett 2018b).
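The data inspection described above can be approximated in R. The following is a minimal sketch, not the workflow actually used (which was carried out in QGIS): the data frame, its values and the `MMRATE` field name are hypothetical stand-ins for the shapefile's attribute table, while `EDLITRWLIT`, `HCELECHELC` and the ‘9999’ missing-value code come from the text.

```r
# Toy stand-in for the shapefile's attribute table; all values are hypothetical.
regions <- data.frame(
  name       = c("A", "B", "C", "D"),
  EDLITRWLIT = c(12.5, 48.0, 30.2, 71.4),  # female literacy (%)
  HCELECHELC = c(35.0, 92.1, 49.9, 88.3),  # households with electricity (%)
  MMRATE     = c(9999, 410, 9999, 380)     # hypothetical incomplete field
)

missing_code <- 9999

# Count sentinel values per numeric field to see which fields are incomplete.
numeric_cols <- vapply(regions, is.numeric, logical(1))
missing_counts <- colSums(regions[numeric_cols] == missing_code)
print(missing_counts)

# Flag sub-regions where fewer than 50% of households have electricity,
# mirroring the rule used for the highlighted symbology.
regions$low_electricity <- regions$HCELECHELC < 50
print(regions[regions$low_electricity, c("name", "HCELECHELC")])
```

With real data, the attribute table would be read from the shapefile rather than typed in, but the two checks (counting sentinel values and applying the 50% rule) would stay the same.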
The initial attempt to produce the map in R used the ‘tmap’ package. However, the end product was aesthetically unsatisfactory and could have been more intuitive, and making fine adjustments required in-depth knowledge of the extensive package documentation. After consulting peers and the ‘leaflet’ practical by Dennett (2018a), a second attempt was made using the ‘leaflet’ package. Bins were determined by quantiles instead of Jenks natural breaks, and other map elements were added to produce a more comprehensive and intuitive map. The result was an interactive map with pop-up labels and multiple overlays, built with the ‘addLayersControl’ function (“Leaflet for R - Show/Hide Layers” 2018). It allows the user to select the variable to be visualised, to zoom in, and to choose the basemap (Esri gray canvas or topographic map).
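As a rough illustration of the quantile binning and layer control described above, the sketch below uses hypothetical values and point markers in place of the sub-region polygons, so it is not the map produced for this study. The group names, coordinates and colour palettes are assumptions; the ‘leaflet’ part only runs if the package is installed.

```r
# Hypothetical rates (%) for eight sub-regions.
literacy    <- c(5.2, 11.8, 17.4, 23.9, 30.1, 38.6, 47.3, 62.0)
electricity <- c(22, 35, 48, 55, 63, 71, 84, 96)

# Quantile breaks: four bins with equal numbers of sub-regions.
breaks <- quantile(literacy, probs = seq(0, 1, by = 0.25), names = FALSE)
bins <- cut(literacy, breaks = breaks, include.lowest = TRUE)
print(table(bins))

if (requireNamespace("leaflet", quietly = TRUE)) {
  library(leaflet)

  # Hypothetical point locations stand in for the sub-region polygons.
  pts <- data.frame(
    lng = seq(62, 72, length.out = 8),
    lat = seq(31, 37, length.out = 8),
    literacy = literacy,
    electricity = electricity
  )

  lit_pal <- colorBin("YlGnBu", domain = pts$literacy, bins = breaks)
  elec_pal <- colorQuantile("OrRd", domain = pts$electricity, n = 4)

  m <- leaflet(pts) %>%
    # Two basemaps the user can switch between.
    addProviderTiles("Esri.WorldGrayCanvas", group = "Gray canvas") %>%
    addProviderTiles("Esri.WorldTopoMap", group = "Topographic") %>%
    # One overlay per variable, each with a pop-up label.
    addCircleMarkers(lng = ~lng, lat = ~lat, color = ~lit_pal(literacy),
                     popup = ~paste0("Female literacy: ", literacy, "%"),
                     group = "Female literacy") %>%
    addCircleMarkers(lng = ~lng, lat = ~lat, color = ~elec_pal(electricity),
                     popup = ~paste0("Electricity: ", electricity, "%"),
                     group = "Household electricity") %>%
    addLayersControl(
      baseGroups = c("Gray canvas", "Topographic"),
      overlayGroups = c("Female literacy", "Household electricity"),
      options = layersControlOptions(collapsed = FALSE)
    )
}
```

With polygon data, `addPolygons` would typically replace `addCircleMarkers`, with the same `group` arguments driving the layer control.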
With the maps displayed in the tabs above, it is important to evaluate the limitations of the data used. Although the data collection methodology was not available to us, one probable limitation is the geo-political situation within Afghanistan, which could lead to the under-reporting of female literacy rates. The accuracy of data collection might also be compromised by security challenges. These uncertainties should be factored into any further analysis.
Using two different approaches to map the same research question illuminated the benefits and drawbacks of each. Cleaning, processing and analysing data is significantly easier with R packages, and R is generally better suited than desktop GIS software to handling large datasets. However, minute adjustments to a map are easier in QGIS because of its intuitive and friendly user interface, which makes navigation smooth and updates the map immediately with each change. In R, even the smallest change requires the cartographer to consult the intricate documentation for each function within each package and to edit the code before re-running chunks. This can be time-consuming, although it becomes more intuitive with experience and practice. Overall, each platform has advantages and disadvantages, and these emphasise their complementarity. The choice of program ultimately depends on the structure of the data and on the cartographer’s task, experience and preference.
Dennett, Adam. 2018a. “Practical 3 - an Introduction to Using R as a GIS.” https://rpubs.com/adam_dennett/427207.
———. 2018b. “Producing Reproducible Research Using RMarkdown and Github.” https://rpubs.com/adam_dennett/430188.
Desai, V. 2012. “Urbanisation and Housing the Poor: Overview.” International Encyclopedia of Housing and Home. Elsevier Ltd.
Repository, Spatial Data. 2018. “2015 USAID-Funded Demographic and Health Surveys Program of Afghanistan.” Accessed October 30. spatialdata.dhsprogram.com.